Language independent search in MediaEval's Spoken Web Search task

Authors

  • Florian Metze
  • Xavier Anguera Miró
  • Etienne Barnard
  • Marelie H. Davel
  • Guillaume Gravier
Abstract

In this paper, we describe several approaches to language-independent spoken term detection and compare their performance on a common task, namely “Spoken Web Search”. The goal of this part of the MediaEval initiative is to perform low-resource language-independent audio search using audio as input. The data was taken from “spoken web” material collected over mobile phone connections by IBM India as well as from the LWAZI corpus of African languages. As part of the 2011 and 2012 MediaEval benchmark campaigns, a number of diverse systems were implemented by independent teams, and submitted to the “Spoken Web Search” task. This paper presents the 2011 and 2012 results, and compares the relative merits and weaknesses of approaches developed by participants, providing analysis and directions for future research, in order to improve voice access to spoken information in low-resource settings. © 2014 Elsevier Ltd. All rights reserved.

Similar resources

LIA @ MediaEval 2013 Spoken Web Search Task: An I-Vector based Approach

In this paper, we describe the LIA system proposed for the MediaEval 2013 Spoken Web Search task. This multilanguage task involves searching for an audio content query, in a database, with no training resources available. The participants must then find locations of each given query term within a large database of untranscribed audio files. For this task, we propose to build a language-independ...

The Spoken Web Search Task at MediaEval 2011

In this paper, we describe the “Spoken Web Search” Task, which was held as part of the 2011 MediaEval benchmark campaign. The purpose of this task was to perform audio search with audio input in four languages, with very few resources being available in each language. The data was taken from “spoken web” material collected over mobile phone connections by IBM India. We present results from seve...

Spoken Web Search

In this paper, we describe the “Spoken Web Search” Task, which was held as part of the 2011 MediaEval campaign. The purpose of this task was to perform audio search in several languages, with very few resources being available in each language. The data was taken from audio content that was created in live settings and was submitted to the “spoken web” over a mobile connection.

Analysis of users’ query reformulation behavior in Web with regard to Wholistic/analytic cognitive styles, Web experience, and search task type

Background and Aim: The basic aim of the present study is to investigate users’ query reformulation behavior with regard to wholistic-analytic cognitive styles, search task type, and experience variables in using the Web. Method: This study is an applied research using survey method. A total of 321 search queries were submitted by 44 users. Data collection tools were Riding’s Cognitive Style A...

MediaEval 2013 Spoken Web Search Task: System Performance Measures

This document discusses how to measure system performance in the Spoken Web Search (SWS) task at MediaEval 2013. The discussion is based on different sources, including the NIST 2006 Spoken Term Detection (STD) Evaluation Plan [1], the NIST 2010 Speaker Recognition Evaluation (SRE) Plan [2], the description of the scoring criteria applied in the SWS task at MediaEval 2012 [3], the Albayzin 2012...

Journal:
  • Computer Speech & Language

Volume 28

Publication date: 2014